Partitioned Blockmap Indexes for Multidimensional Data Access
Authors
Abstract
Given recent increases in the size of main memory in modern machines, it is now common to store large data sets in RAM for faster processing. Multidimensional access methods aim to provide efficient access to large data sets when queries apply predicates to some of the data dimensions. We examine multidimensional access methods in the context of an in-memory column store tuned for on-line analytical processing or scientific data analysis. We propose a multidimensional data structure that contains a novel combination of a grid array and several bitmaps. The base data is clustered in an order matching that of the index structure. The bitmaps contain one bit per block of data, motivating the term “blockmap.” The proposed data structures are compact, typically taking less than one bit of space per row of data. Partition boundaries can be chosen to reflect both the query workload and the data distribution, and they need not divide the data evenly if the query distribution is biased. We examine the theoretical performance of the data structure and experimentally measure its performance on three modern CPUs and one GPU. We demonstrate that efficient multidimensional access can be achieved with minimal space overhead.
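The abstract gives no implementation details, so the following is only a minimal sketch of the one-bit-per-block idea, not the authors' data structure. It assumes one bitmap per (dimension, partition) pair, assumes rows are clustered by their partition coordinates, and leaves out the grid array entirely. All names (`partition_of`, `build_blockmaps`, `candidate_blocks`, `range_query`) are hypothetical.

```python
from bisect import bisect_right

def partition_of(value, boundaries):
    """Index of the partition holding value; boundaries are sorted split points."""
    return bisect_right(boundaries, value)

def build_blockmaps(rows, boundaries_per_dim, block_size):
    """Cluster rows by partition coordinates and build one bitmap per
    (dimension, partition). Each bitmap is a Python int with one bit per block.
    This layout is an assumption made for illustration."""
    def key(row):
        return tuple(partition_of(row[d], b)
                     for d, b in enumerate(boundaries_per_dim))
    sorted_rows = sorted(rows, key=key)
    blockmaps = [[0] * (len(b) + 1) for b in boundaries_per_dim]
    for i, row in enumerate(sorted_rows):
        block = i // block_size
        for d, b in enumerate(boundaries_per_dim):
            # Set the bit: this block holds at least one row in this partition.
            blockmaps[d][partition_of(row[d], b)] |= 1 << block
    return sorted_rows, blockmaps

def candidate_blocks(blockmaps, boundaries_per_dim, ranges, num_blocks):
    """ranges maps dimension index -> (lo, hi), both inclusive.
    OR the bitmaps of overlapping partitions within a dimension,
    then AND the results across dimensions."""
    result = (1 << num_blocks) - 1
    for d, (lo, hi) in ranges.items():
        b = boundaries_per_dim[d]
        mask = 0
        for p in range(partition_of(lo, b), partition_of(hi, b) + 1):
            mask |= blockmaps[d][p]
        result &= mask
    return [blk for blk in range(num_blocks) if (result >> blk) & 1]

def range_query(sorted_rows, blockmaps, boundaries_per_dim, ranges, block_size):
    """Scan only the candidate blocks and re-check the predicate on each row."""
    num_blocks = -(-len(sorted_rows) // block_size)  # ceiling division
    out = []
    for blk in candidate_blocks(blockmaps, boundaries_per_dim, ranges, num_blocks):
        for row in sorted_rows[blk * block_size:(blk + 1) * block_size]:
            if all(lo <= row[d] <= hi for d, (lo, hi) in ranges.items()):
                out.append(row)
    return out

# Example: an 8x8 grid of points, one split point per dimension, 16 rows per block.
rows = [(x, y) for x in range(8) for y in range(8)]
boundaries = [[4], [4]]
sorted_rows, blockmaps = build_blockmaps(rows, boundaries, block_size=16)
hits = range_query(sorted_rows, blockmaps, boundaries, {0: (0, 2), 1: (5, 7)}, 16)
```

In this toy layout the index holds 4 bitmaps of 4 bits each for 64 rows, or 0.25 bits per row. More generally, with P partitions in total and B rows per block, the sketch uses about P/B bits per row, which is one way a sub-one-bit-per-row footprint could arise.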
Related resources
Institut für Informatik der Technischen Universität München MISTRAL: Processing Relational Queries using a Multidimensional Access Technique
A multidimensional access method offering significant performance increases by intelligently partitioning the query space is applied to relational database management systems (RDBMS). We introduce a formal model for multidimensional partitioned relations and discuss several typical query patterns. The model identifies the significance of multidimensional range queries and sort operations. The d...
MISTRAL: Processing Relational Queries using a Multidimensional Access Technique
A multidimensional access method offering significant performance increases by intelligently partitioning the query space is applied to relational database management systems (RDBMS). We introduce a formal model for multidimensional partitioned relations and discuss several typical query patterns. The model identifies the significance of multidimensional range queries and sort operations. The d...
Towards a general purpose, multidimensional index: integration, optimization, and enhancement of UB-trees
Multidimensional access methods are considered to be a promising approach for providing acceptable performance to analysis-centric applications. However, despite the large body of research work in this field, the commercial support for multidimensional indexes is still very weak. The reason for this discrepancy is threefold: first, no standard multidimensional index like the B-Tree for one-dime...
Parallelizing multidimensional indexes for main memory databases
Parallelizing multidimensional indexes for main memory databases Master thesis,
Bounding Predicates and Insertion Policies for Multidimensional Indexes
We present two new techniques for improving the performance of multidimensional indexes. For static data sets, we find that bulk loading techniques are effective at clustering data items in the index; however, traditional designs of an index's bounding predicates can lead to poor performance. We develop and implement in GiST three new bounding predicates, two of which have much better performance ...
Publication date: 2011